# Speed up (filtered) KNN queries for flat vector fields #130251

Merged: 21 commits merged into elastic:main on Jun 30, 2025

## Conversation

@jimczi (Contributor) commented Jun 27, 2025


For dense vector fields using the `flat` index, we already know a brute-force search will be used—so there’s no need to go through the codec’s approximate KNN logic. This change skips that step and builds the brute-force query directly, making things faster and simpler.

I tested this on a setup with **10 million random vectors**, each with **1596 dimensions** and **17,500 partitions**, using the `random_vector` track.
The results:

### Performance Comparison

| Metric            | Before    | After      | Change    |
| ----------------- | --------- | ---------- | --------- |
| **Throughput**    | 221 ops/s | 2762 ops/s | 🟢 +1149% |
| **Latency (p50)** | 29.2 ms   | 1.6 ms     | 🔻 -94.4% |
| **Latency (p99)** | 81.6 ms   | 3.5 ms     | 🔻 -95.7% |

Filtered KNN queries on flat vectors are now over 10x faster on my laptop!
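For illustration, here's a minimal sketch of what the per-segment brute-force scan boils down to: iterate the field's vectors, skip anything the filter rejects, and keep the top `k` by similarity. It uses the Lucene 9-style `FloatVectorValues` iteration API, and the class and method names are assumptions for the example, not the actual code in this change.

```java
// Minimal sketch (not the actual Elasticsearch code): exact top-k scan over one
// segment's float vectors, bypassing any approximate-KNN structures.
import java.io.IOException;
import java.util.Arrays;
import java.util.PriorityQueue;

import org.apache.lucene.index.FloatVectorValues;
import org.apache.lucene.index.LeafReader;
import org.apache.lucene.index.VectorSimilarityFunction;
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.util.Bits;

final class ExactKnnScan {

    /** Scores every filter-accepted vector in the segment and keeps the best k. */
    static ScoreDoc[] topK(LeafReader reader, String field, float[] query, int k,
                           VectorSimilarityFunction similarity, Bits acceptDocs) throws IOException {
        FloatVectorValues vectors = reader.getFloatVectorValues(field);
        if (vectors == null || k <= 0) {
            return new ScoreDoc[0];
        }
        // Min-heap on score so the weakest hit is evicted first.
        PriorityQueue<ScoreDoc> heap = new PriorityQueue<>(k, (a, b) -> Float.compare(a.score, b.score));
        for (int doc = vectors.nextDoc(); doc != DocIdSetIterator.NO_MORE_DOCS; doc = vectors.nextDoc()) {
            if (acceptDocs != null && acceptDocs.get(doc) == false) {
                continue; // rejected by the filter: never scored
            }
            float score = similarity.compare(query, vectors.vectorValue());
            if (heap.size() < k) {
                heap.add(new ScoreDoc(doc, score));
            } else if (score > heap.peek().score) {
                heap.poll();
                heap.add(new ScoreDoc(doc, score));
            }
        }
        ScoreDoc[] hits = heap.toArray(new ScoreDoc[0]);
        Arrays.sort(hits, (a, b) -> Float.compare(b.score, a.score)); // best first
        return hits;
    }
}
```

Since every accepted vector is scored exactly, the filter is just a cheap per-document check rather than a constraint on a graph traversal.
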
@elasticsearchmachine (Collaborator)

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine added the `Team:Search Relevance` label on Jun 27, 2025.

@elasticsearchmachine (Collaborator)

Hi @jimczi, I've created a changelog YAML for you.

@benwtrent (Member) left a comment

I am loving these numbers. Thank you for digging into this!

@jimczi (Contributor, Author) commented Jun 30, 2025

I tweaked the brute-force nested version so that it always diversifies the child docs. It’s similar in spirit to what we do for HNSW, but much faster here since we can just walk through each parent’s block in order.
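Sketched out, the diversification is a single ordered pass over the children that flushes the best-scoring child whenever the walk crosses into a new parent block. The sketch below is illustrative only (Lucene 9-style vector iteration, names assumed, edge cases omitted), not the actual implementation.

```java
// Illustrative sketch only: keep the single best-scoring child per parent block,
// walking the children in doc-id order (children precede their parent in the block).
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.index.FloatVectorValues;
import org.apache.lucene.index.VectorSimilarityFunction;
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.util.BitSet;

final class ParentBlockDiversify {

    /** Emits at most one ScoreDoc per parent: its best-scoring child. */
    static List<ScoreDoc> bestChildPerParent(FloatVectorValues children, float[] query,
                                             VectorSimilarityFunction similarity,
                                             BitSet parentDocs) throws IOException {
        List<ScoreDoc> results = new ArrayList<>();
        int currentParent = -1;
        int bestChild = -1;
        float bestScore = Float.NEGATIVE_INFINITY;
        for (int child = children.nextDoc(); child != DocIdSetIterator.NO_MORE_DOCS; child = children.nextDoc()) {
            // The parent is the first parent bit at or after the child (assumes well-formed blocks).
            int parent = parentDocs.nextSetBit(child);
            if (parent != currentParent) {
                if (bestChild != -1) {
                    results.add(new ScoreDoc(bestChild, bestScore)); // flush the previous block's winner
                }
                currentParent = parent;
                bestChild = -1;
                bestScore = Float.NEGATIVE_INFINITY;
            }
            float score = similarity.compare(query, children.vectorValue());
            if (score > bestScore) {
                bestScore = score;
                bestChild = child;
            }
        }
        if (bestChild != -1) {
            results.add(new ScoreDoc(bestChild, bestScore)); // flush the final block
        }
        return results;
    }
}
```

Because each parent's children are contiguous and visited in doc-id order, the whole pass is linear, with no per-parent work beyond locating the owning parent bit.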

I ran a comparison between main and this branch using the nested mode in the `random_vector` track (added here):

| Metric | Task | Baseline | Contender | Diff | Unit | Diff % |
| --- | --- | --- | --- | --- | --- | --- |
| Min Throughput | brute-force-filtered-search | 2052.39 | 85.0901 | -1967.3 | ops/s | -95.85% |
| Mean Throughput | brute-force-filtered-search | 2214.2 | 87.3678 | -2126.83 | ops/s | -96.05% |
| Median Throughput | brute-force-filtered-search | 2240.51 | 87.3315 | -2153.18 | ops/s | -96.10% |
| Max Throughput | brute-force-filtered-search | 2250.5 | 88.3819 | -2162.12 | ops/s | -96.07% |
| 50th percentile latency | brute-force-filtered-search | 2.2724 | 79.5413 | 77.2689 | ms | +3400.33% |
| 90th percentile latency | brute-force-filtered-search | 2.49625 | 136.674 | 134.178 | ms | +5375.18% |
| 99th percentile latency | brute-force-filtered-search | 8.14283 | 199.337 | 191.195 | ms | +2348.01% |
| 99.9th percentile latency | brute-force-filtered-search | 50.8515 | 265.505 | 214.654 | ms | +422.12% |
| 99.99th percentile latency | brute-force-filtered-search | 70.5917 | 359.416 | 288.824 | ms | +409.15% |
| 100th percentile latency | brute-force-filtered-search | 74.4209 | 438.119 | 363.698 | ms | +488.70% |
| 50th percentile service time | brute-force-filtered-search | 2.2724 | 79.5413 | 77.2689 | ms | +3400.33% |
| 90th percentile service time | brute-force-filtered-search | 2.49625 | 136.674 | 134.178 | ms | +5375.18% |
| 99th percentile service time | brute-force-filtered-search | 8.14283 | 199.337 | 191.195 | ms | +2348.01% |
| 99.9th percentile service time | brute-force-filtered-search | 50.8515 | 265.505 | 214.654 | ms | +422.12% |
| 99.99th percentile service time | brute-force-filtered-search | 70.5917 | 359.416 | 288.824 | ms | +409.15% |
| 100th percentile service time | brute-force-filtered-search | 74.4209 | 438.119 | 363.698 | ms | +488.70% |
| error rate | brute-force-filtered-search | 0 | 0 | 0 | % | 0.00% |

As you can see, performance drops off pretty hard in the nested filtered case, so there is a lot of room for improvement. I think we should tackle that in a follow-up. Profiling shows things like `TimeOutCheckingBits` showing up a lot, so there are some easy wins we can go after.

elasticsearchmachine pushed a commit that referenced this pull request Jun 30, 2025
…130263)

PR #130251 made me realize we were missing some important coverage.

This adds nested vector query (and top level knn) tests for flat indices in our yaml tests.

@benwtrent (Member) left a comment

OK, if this passes given the new yaml tests, I think this is g2g!

@jimczi (Contributor, Author) commented Jun 30, 2025

> OK, if this passes given the new yaml tests, I think this is g2g!

Well, it's broken ;)
Let me check what's going on.

@jimczi (Contributor, Author) commented Jun 30, 2025

OK, that was an issue when the diversifying query hits `NO_MORE_DOCS`. Let's wait for another round.

@jimczi merged commit 2142915 into elastic:main on Jun 30, 2025
32 checks passed
@jimczi deleted the brute_force_knn_optim branch on June 30, 2025 at 18:19
mridula-s109 pushed a commit to mridula-s109/elasticsearch that referenced this pull request Jul 3, 2025
mridula-s109 pushed a commit to mridula-s109/elasticsearch that referenced this pull request Jul 3, 2025
Labels: `>enhancement`, `:Search Relevance/Vectors`, `Team:Search Relevance`, `v9.2.0`